SemMT: A Semantic-Based Testing Approach for Machine Translation Systems

Abstract

Machine translation has wide applications in daily life. In mission-critical applications such as translating official documents, incorrect translation can have unpleasant or sometimes catastrophic consequences. This motivates recent research on testing methodologies for machine translation systems. Existing methodologies mostly rely on metamorphic relations designed at the textual level (e.g., Levenshtein distance) or the syntactic level (e.g., the distance between grammar structures) to determine the correctness of translation results. However, these metamorphic relations do not consider whether the original and translated sentences have the same meaning (i.e., semantic similarity). Therefore, in this paper, we propose SemMT, an automatic testing approach for machine translation systems based on semantic similarity checking. SemMT applies round-trip translation and measures the semantic similarity between the original and translated sentences. Our insight is that the semantics expressed by the logic and numeric constraints in sentences can be captured using regular expressions (or deterministic finite automata), for which efficient equivalence/similarity checking algorithms are available. Leveraging this insight, we propose three semantic similarity metrics and implement them in SemMT. The experimental results show that SemMT achieves higher effectiveness than state-of-the-art works, with increases of 21% and 23% in accuracy and F-score, respectively. We also explore potential improvements that can be achieved when proper combinations of metrics are adopted. Finally, we discuss a solution to locate the suspicious trip in round-trip translation, which may shed light on further exploration.
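The idea of comparing sentence meaning through regular expressions can be illustrated with a small sketch. It is not SemMT's implementation or one of its three metrics: it assumes that some external step (not shown, and hypothetical here) has already produced one regular expression for the original sentence and one for the round-trip translation, and it approximates similarity as the Jaccard overlap of the strings each expression accepts up to a bounded length. The names `accepted_strings`, `bounded_jaccard`, and `is_suspicious`, the alphabet, the length bound, and the threshold are all illustrative assumptions.

```python
from __future__ import annotations

import re
from itertools import product


def accepted_strings(pattern: str, alphabet: str, max_len: int) -> set[str]:
    """Enumerate every string over `alphabet` of length <= max_len
    that the regular expression `pattern` matches in full."""
    compiled = re.compile(pattern)
    accepted = set()
    for length in range(max_len + 1):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            if compiled.fullmatch(candidate):
                accepted.add(candidate)
    return accepted


def bounded_jaccard(p1: str, p2: str, alphabet: str = "ab", max_len: int = 4) -> float:
    """Assumed similarity measure: Jaccard overlap of the two bounded languages.
    Returns 1.0 when both expressions accept nothing within the bound."""
    s1 = accepted_strings(p1, alphabet, max_len)
    s2 = accepted_strings(p2, alphabet, max_len)
    union = s1 | s2
    if not union:
        return 1.0
    return len(s1 & s2) / len(union)


def is_suspicious(original_regex: str, round_trip_regex: str, threshold: float = 0.9) -> bool:
    """Flag a round-trip translation whose regex drifts from the original.
    The threshold is an arbitrary illustrative value."""
    return bounded_jaccard(original_regex, round_trip_regex) < threshold


# "one or more a" written two ways: same language, so not flagged.
print(is_suspicious("a+", "aa*"))
# "at least two a" vs. "at least three a": a numeric constraint changed.
print(is_suspicious("a{2,}", "a{3,}"))
```

Bounded enumeration is exponential in the length bound and only approximates language equivalence; an exact check would compare the corresponding deterministic finite automata, which is the setting the abstract refers to.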

Similar resources

Semantic Web based Machine Translation

This paper describes the experimental combination of traditional Natural Language Processing (NLP) technology with the Semantic Web building stack in order to extend the expert knowledge required for a Machine Translation (MT) task. Therefore, we first give a short introduction to the state of the art of MT and the Semantic Web and discuss the problem of disambiguation being one of the common c...

Knowledge-Based Semantic Embedding for Machine Translation

In this paper, with the help of a knowledge base, we build and formulate a semantic space to connect the source and target languages, and apply it to the sequence-to-sequence framework to propose a Knowledge-Based Semantic Embedding (KBSE) method. In our KBSE method, the source sentence is first mapped into a knowledge-based semantic space, and the target sentence is generated using a recurrent...

A scalarization-based method for multiple part-type scheduling of two-machine robotic systems with non-destructive testing technologies

This paper analyzes the performance of a robotic system with two machines that are configured in a circular layout and produce non-identical parts repetitively. Non-destructive testing (NDT) is performed by a stationary robotic arm located in the center of the circle, or a cluster tool. The robotic arm integrates multiple tasks, mainly the NDT of the part and its transition bet...

A Bilingual Graph-Based Semantic Model for Statistical Machine Translation

Rui Wang, Hai Zhao, Sabine Ploux, Bao-Liang Lu, and Masao Utiyama. Department of Computer Science and Eng.; Key Lab of Shanghai Education Commission for Intelligent Interaction and Cognitive Eng., Shanghai Jiao Tong University, Shanghai, China; Centre National de la Recherche Scientifique, CNRS-L2C2, France; National Institute of Information and Communications Technology, Kyoto, Japan. wangrui....

A Hybrid Machine Translation System Based on a Monotone Decoder

In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...

Journal

Journal title: ACM Transactions on Software Engineering and Methodology

Year: 2022

ISSN: 1049-331X, 1557-7392

DOI: https://doi.org/10.1145/3490488